import pandas as pd
import seaborn as sns
import plotly.express as px
import matplotlib.pyplot as plt
import plotly.io as pio
pio.renderers.default = "plotly_mimetype+notebook"
For this exercise, the following code loads the stock dataset built into Plotly Express.
stocks = px.data.stocks()
stocks.head()
| | date | GOOG | AAPL | AMZN | FB | NFLX | MSFT |
|---|---|---|---|---|---|---|---|
| 0 | 2018-01-01 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 |
| 1 | 2018-01-08 | 1.018172 | 1.011943 | 1.061881 | 0.959968 | 1.053526 | 1.015988 |
| 2 | 2018-01-15 | 1.032008 | 1.019771 | 1.053240 | 0.970243 | 1.049860 | 1.020524 |
| 3 | 2018-01-22 | 1.066783 | 0.980057 | 1.140676 | 1.016858 | 1.307681 | 1.066561 |
| 4 | 2018-01-29 | 1.008773 | 0.917143 | 1.163374 | 1.018357 | 1.273537 | 1.040708 |
Select a stock and create a suitable plot for it. Make sure the plot is readable and shows the relevant information, such as the dates and values.
x = stocks.date
y = stocks.GOOG
fig, ax = plt.subplots(figsize=(10,5))
ax.plot(x,y)
ax.set_title('Stock of Google')
ax.set_xlabel('Dates')
ax.set_ylabel('Change')
ax.set_xticks([0,14,29,44,59,74,89,104])
plt.xticks(rotation=90)
plt.show()
You've already plotted data from one stock. It is also possible to plot several stocks together to support comparison.
To distinguish the lines, customise their line styles, markers and colors, and add a legend to the plot.
x = stocks.date
Stock1 = stocks.GOOG
Stock2 = stocks.AAPL
Stock3 = stocks.AMZN
Stock4 = stocks.FB
Stock5 = stocks.NFLX
Stock6 = stocks.MSFT
fig, ax = plt.subplots(figsize=(15,5))
ax.plot(x,Stock1, label= 'Google', linestyle=':')
ax.plot(x,Stock2, label= 'Apple')
ax.plot(x,Stock3, label= 'Amazon')
ax.plot(x,Stock4, label= 'Facebook')
ax.plot(x,Stock5, label= 'Netflix')
ax.plot(x,Stock6, label= 'Microsoft')
ax.set_title('All of the Stocks')
ax.set_xlabel('Dates')
ax.set_ylabel('Change')
ax.set_xticks([0,14,29,44,59,74,89,104])
ax.legend()
plt.xticks(rotation=90)
plt.show()
First, load the tips dataset.
tips = sns.load_dataset('tips')
# Tip as a fraction of the total bill
tips['%tips'] = tips.tip/tips.total_bill
tips.head()
| | total_bill | tip | sex | smoker | day | time | size | %tips |
|---|---|---|---|---|---|---|---|---|
| 0 | 16.99 | 1.01 | Female | No | Sun | Dinner | 2 | 0.059447 |
| 1 | 10.34 | 1.66 | Male | No | Sun | Dinner | 3 | 0.160542 |
| 2 | 21.01 | 3.50 | Male | No | Sun | Dinner | 3 | 0.166587 |
| 3 | 23.68 | 3.31 | Male | No | Sun | Dinner | 2 | 0.139780 |
| 4 | 24.59 | 3.61 | Female | No | Sun | Dinner | 4 | 0.146808 |
Let's explore this dataset. Pose a question and create a plot that helps answer it.
Some possible questions:
# Does the tip share differ between male and female customers, and between lunch and dinner?
g = sns.FacetGrid(tips, col='sex', hue='time')
g.map(sns.histplot, '%tips')
g.add_legend()
#plt.savefig('smoker.png', dpi=200)
plt.show()
Redo the above exercises (challenges 2 & 3) with plotly express. Create diagrams which you can interact with.
Hints:
df = px.data.stocks()
stocks_unpivoted = df.melt(id_vars='date',var_name='stock name',value_name='stock value')
fig = px.line(stocks_unpivoted, x='date', y='stock value', color='stock name')
# Alternative (wide form): pass the column names directly as y.
#fig = px.line(df, x="date", y=["GOOG", "AAPL", "AMZN", "FB", "NFLX", "MSFT"])
# Note: with a misspelled name ("APPL" instead of "AAPL") the list is not recognised as column names and plotly raises:
#ValueError: All arguments should have the same length.
#The length of argument `y` is 6, whereas the length of previously-processed arguments ['date'] is 105
fig.show()
tips = px.data.tips()
# Tip as a fraction of the total bill
tips['%tips'] = tips.tip/tips.total_bill
tips.head()
fig=px.histogram(tips,x='%tips',color='time',facet_col='sex')
fig.show()
Recreate the barplot below that shows the population of different continents for the year 2007.
Hints:
#load data
df = px.data.gapminder()
df.head()
df_2007 = df.query('year == 2007')
# Sum the numeric columns per continent; numeric_only=True skips text columns such as country
df_2007_new = df_2007.groupby('continent').sum(numeric_only=True)
df_2007_new.head()
| continent | year | lifeExp | pop | gdpPercap | iso_num |
|---|---|---|---|---|---|
| Africa | 104364 | 2849.914 | 929539692 | 160629.695446 | 23859 |
| Americas | 50175 | 1840.203 | 898871184 | 275075.790634 | 9843 |
| Asia | 66231 | 2334.040 | 3811953827 | 411609.886714 | 13354 |
| Europe | 60210 | 2329.458 | 586098529 | 751634.449078 | 12829 |
| Oceania | 4014 | 161.439 | 24549947 | 59620.376550 | 590 |
fig = px.bar(df_2007_new, x='pop', y= df_2007_new.index, orientation='h', color=df_2007_new.index, text= 'pop')
fig.update_yaxes(categoryorder='total descending')
fig.show()